# 384 High Resolution

Vit Base Patch16 Siglip 384.webli

A Vision Transformer model based on SigLIP. It contains only the image encoder and uses the original attention pooling mechanism.

Image Classification

Deit Base Patch16 384

DeiT is a Vision Transformer model designed for data-efficient training. This variant is pre-trained and fine-tuned on the ImageNet-1k dataset at 384x384 resolution and is suited to image classification tasks.

Image Classification
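
Both model names above combine "Patch16" with a 384 input size. As a rough illustration of what that implies, the sketch below computes how many patches the encoder would see, assuming "Patch16" means the usual Vision Transformer scheme of non-overlapping 16x16 patches on a square image. The helper name `patch_grid` is hypothetical, and the count ignores any extra tokens (such as a class token) a particular model may add.

```python
def patch_grid(image_size: int = 384, patch_size: int = 16) -> tuple[int, int]:
    """Return (patches_per_side, total_patches) for a square input image.

    Assumes non-overlapping square patches, as in a standard Vision Transformer.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side, per_side * per_side


if __name__ == "__main__":
    # 384 / 16 = 24 patches per side, 24 * 24 = 576 patches in total
    print(patch_grid(384, 16))
    # For comparison, a 224x224 input with the same patch size
    print(patch_grid(224, 16))
```

Under these assumptions a 384x384 input yields 576 patches versus 196 at 224x224, which is why the higher-resolution variants cost noticeably more compute per image.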
